Duration modeling for Chinese synthesis from C-ToBI labeled corpus
Abstract
A set of labeling criteria, C-ToBI (Chinese Tone and Break Index), was redefined to annotate prosodic events in continuous speech in a hierarchical structure. There are four layers: intonational phrase, intermediate phrase, word, and syllable. The prosodic structure, break index, and stress index tiers represent the core prosodic events of an utterance. The stress index represents the degree of accent of the constituents in each layer. The break index represents the degree of juncture between each pair of constituents in each layer. A duration model was built from a reading-style corpus labeled with C-ToBI. The factors affecting the duration of a given segment come from two relatively independent levels. First, at the segmental level, the phoneme of the segment and its context influence the duration. Second, at the suprasegmental level, the influences come from multiple layers and include the location and degree of stress and break in the different layers. The factors that have the property of directional invariance form the feature vector that served as the input to the linear duration model. The model was part of a speech synthesis system, and its parameters were estimated with a statistical approach.
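The abstract describes a linear model that maps a feature vector (segmental identity plus break and stress information) to a segment duration, with parameters estimated statistically. The following is a minimal sketch of that general idea, not the authors' implementation. The phoneme inventory, the index ranges, the feature encoding, the least-squares estimator, and all numeric values are assumptions made for illustration; the durations are synthetic and do not come from the corpus.

```python
from typing import List, Sequence

# Hypothetical feature inventory: the abstract does not list the actual features.
PHONEMES = ["a", "i", "sh"]
MAX_BREAK = 4   # assumed break index range 0..4
MAX_STRESS = 3  # assumed stress index range 0..3


def encode(phoneme: str, break_index: int, stress_index: int) -> List[float]:
    """One-hot phoneme (acts as a per-phoneme intercept) plus scaled break and stress."""
    one_hot = [1.0 if phoneme == p else 0.0 for p in PHONEMES]
    return one_hot + [break_index / MAX_BREAK, stress_index / MAX_STRESS]


def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    """Linear duration model: duration = sum_i w_i * x_i."""
    return sum(w * xi for w, xi in zip(weights, x))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(xs: List[List[float]], ys: List[float], ridge: float = 1e-8) -> List[float]:
    """Least-squares fit via the normal equations, with a tiny ridge term for stability."""
    d = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(xtx, xty)


# Synthetic toy data in milliseconds (made up for this sketch, not from the paper).
TRUE_W = [120.0, 100.0, 90.0, 60.0, 30.0]
samples = [encode(p, b, s) for p in PHONEMES for b in (0, 2, 4) for s in (0, 3)]
durations = [predict(TRUE_W, x) for x in samples]
weights = fit(samples, durations)
```

In this sketch each weight contributes additively, so a higher break or stress index lengthens the predicted duration by a fixed amount regardless of the phoneme. A real system would use a much richer feature set drawn from the labeled tiers and would fit on measured segment durations.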
Related resources
Speech corpus of Chinese discourse and the phonetic research
A speech corpus of Chinese discourse (ASCCD) was set up and annotated on segmental, prosodic, and syntactic tiers. SAMPA-C and C-ToBI conventions are used for segmental and prosodic labeling. Sound variations such as assimilation, insertion, and deletion are investigated on the labeled database. The prosodic research focuses on the sentence stress that involves the specification of relative promin...
Automatic labeling of Japanese prosody using J-ToBI style description
Speech corpora with prosodic labels are getting more and more important not only for speech synthesis but also for discourse modeling. A widely used labeling system for Japanese prosody, J-ToBI, however, is insufficient for applications like discourse modeling and it even lacks an accurate method for automatic labeling. In this paper, we propose an automatic labeling method for J-ToBI style des...
Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events
A system for automatic recognition of prosodic events in speech utterances has been developed and applied to recognizing accent tones as defined by the tone and break index (ToBI) prosodic labeling standard. Both the acoustic and syntactic modeling portions of the system are described in the paper. The acoustic modeling portion of the system involves representation of ToBI labeled events using h...
ToBI Prosodic Analysis of a Professional Speaker of American English
We analyze the distribution of ToBI labels in a corpus collected from a professional speaker for use in concatenative speech synthesis. Our goals include using such statistics to aid automatic ToBI labeling of such a corpus, analogously to how a language model aids speech recognition. We find that the professional speaker produces a rich variety of prosodic events. ToBI labels occur with skewed...